fix(mito2): schema-safe skipping index pruning #8122

Merged
evenyag merged 9 commits into GreptimeTeam:main from fengys1996:fix/skip-index-compat
May 25, 2026

@fengys1996
Contributor

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

Same fix as #8074, but for the skipping index.

What's changed and what's your intention?

This PR mainly makes the following changes:

  • Made skipping-index pruning schema-safe by checking predicate type compatibility per SST, preventing false negatives under schema evolution.
  • Switched from whole-applier fallback to column-subset fallback, so compatible predicates still prune while incompatible ones are skipped.
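The column-subset fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ColumnType`, `ColumnId`, and `compatible_subset` are hypothetical stand-ins, and the real applier in mito2 works with its own metadata and predicate types.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for a column data type (not the mito2 type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Int64,
    String,
}

/// Hypothetical column identifier.
pub type ColumnId = u32;

/// Keeps only the predicates whose expected type matches the type stored in one SST.
///
/// `predicates` maps a column to (type expected by the current region schema, literals).
/// `sst_types` maps a column to the type recorded in that SST's own metadata.
pub fn compatible_subset(
    predicates: &BTreeMap<ColumnId, (ColumnType, Vec<String>)>,
    sst_types: &HashMap<ColumnId, ColumnType>,
) -> BTreeMap<ColumnId, Vec<String>> {
    predicates
        .iter()
        // A column survives only if the SST has it with exactly the expected type.
        // Columns with a changed type, or absent from the SST, are dropped here.
        .filter(|(id, (expected, _))| sst_types.get(*id) == Some(expected))
        .map(|(id, (_, values))| (*id, values.clone()))
        .collect()
}
```

In this sketch an incompatible column only removes its own predicate, so the remaining predicates can still prune that SST. An empty result would mean no index pruning for the file, which is the conservative outcome.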

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

@github-actions github-actions Bot added size/S docs-not-required This change does not impact docs. labels May 15, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces logic to handle column type changes in bloom filter indexing by verifying type compatibility between region metadata and SST metadata. It adds an SstApplyPlan to manage predicates and ensures that indexes are only applied when types match, preventing issues after an ALTER TABLE operation. Review feedback suggests optimizing the SstApplyPlan by excluding columns that are entirely missing from the SST to improve cache key consistency and reduce unnecessary map entries.
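To make the reviewer's distinction concrete, the per-column outcome can be viewed as a three-way decision. The sketch below is only an illustration under assumed names: `Decision` and `decide` are hypothetical and do not reflect the actual layout of `SstApplyPlan`.

```rust
/// Hypothetical stand-in for a column data type (not the mito2 type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Int64,
    String,
}

/// Hypothetical per-column outcome when planning index pruning for one SST.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Types match, so the index stored in this SST can be used.
    Prune,
    /// The column exists in the SST with a different type, so its index is skipped.
    SkipTypeChanged,
    /// The column is absent from the SST, so it is left out of the plan entirely.
    SkipMissing,
}

/// Compares the type expected by the region schema with the type found in the SST.
pub fn decide(expected: ColumnType, in_sst: Option<ColumnType>) -> Decision {
    match in_sst {
        Some(actual) if actual == expected => Decision::Prune,
        Some(_) => Decision::SkipTypeChanged,
        None => Decision::SkipMissing,
    }
}
```

Under this reading, only `Prune` columns would contribute entries to a plan, which is what keeps the plan (and anything keyed on it) free of columns the SST never contained.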

Comment thread src/mito2/src/sst/index/bloom_filter/applier.rs Outdated
@fengys1996 fengys1996 marked this pull request as ready for review May 19, 2026 14:27
@fengys1996 fengys1996 requested review from a team, evenyag, v0y4g3r and waynexia as code owners May 19, 2026 14:27
Comment thread src/mito2/src/sst/index/bloom_filter/applier.rs Outdated
Comment thread src/mito2/src/sst/index/bloom_filter/applier.rs Outdated
@fengys1996 fengys1996 force-pushed the fix/skip-index-compat branch from fdd3f00 to 59c4cd2 Compare May 24, 2026 15:41
@fengys1996 fengys1996 marked this pull request as draft May 24, 2026 16:35
@fengys1996
Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@fengys1996 fengys1996 marked this pull request as ready for review May 25, 2026 03:17
@fengys1996 fengys1996 requested a review from Copilot May 25, 2026 03:18
@fengys1996
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements a compatibility verification for bloom filter predicates against SST metadata, ensuring that indexes are only applied when column types match. Key changes include the addition of compatible_predicate_for_sst to the BloomFilterIndexApplier and its integration into the Parquet row group pruning process. Review feedback identifies an opportunity to optimize this check by excluding columns missing from the SST, which would enhance caching efficiency and prevent redundant processing during index application.

Comment thread src/mito2/src/sst/index/bloom_filter/applier.rs
Contributor

Copilot AI left a comment

Pull request overview

This PR makes mito2 skipping-index pruning safer across schema evolution by comparing predicate column types against each SST’s stored metadata before applying bloom-filter pruning.

Changes:

  • Adds per-SST compatible predicate selection for bloom-filter index pruning.
  • Updates reader cache-key generation and apply calls to use the compatible predicate subset.
  • Adds regression coverage for changing a skipping-indexed column type.
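One way to picture the per-SST cache key is to hash only the predicates that will actually be applied to that file. This is a rough sketch under assumptions: `predicate_key` and `ColumnId` are hypothetical, and the key mito2 really uses may be built differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical column identifier.
pub type ColumnId = u32;

/// Hypothetical cache key: a hash of the predicate subset applied to one SST.
pub fn predicate_key(compatible: &BTreeMap<ColumnId, Vec<String>>) -> u64 {
    let mut hasher = DefaultHasher::new();
    // BTreeMap iterates in key order, so the key does not depend on insertion order.
    for (id, values) in compatible {
        id.hash(&mut hasher);
        values.hash(&mut hasher);
    }
    hasher.finish()
}
```

The design point is that two SSTs written under different schemas may end up with different compatible subsets, so a pruning result cached for one subset should not be reused for another.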

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/mito2/src/sst/index/bloom_filter/applier.rs | Adds compatible predicate filtering and updates bloom-filter apply flow. |
| src/mito2/src/sst/index/bloom_filter/applier/builder.rs | Tracks expected predicate column types from region metadata. |
| src/mito2/src/sst/parquet/reader.rs | Passes SST metadata into bloom pruning and computes per-SST cache keys. |
| src/mito2/src/sst/parquet.rs | Updates tests for the new bloom predicate key flow. |
| tests/cases/standalone/common/alter/change_col_type_skipping_index.sql / .result | Adds SQL regression test for skipping-index behavior after column type changes. |

Comment thread src/mito2/src/sst/index/bloom_filter/applier.rs
@github-actions github-actions Bot added size/M and removed size/S labels May 25, 2026
@fengys1996
Contributor Author

Made some minor refactors, ptal @evenyag

@evenyag evenyag enabled auto-merge May 25, 2026 11:40
@evenyag evenyag added this pull request to the merge queue May 25, 2026
@evenyag evenyag removed this pull request from the merge queue due to a manual request May 25, 2026
@evenyag
Contributor

evenyag commented May 25, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. What shall we delve into next?

@evenyag evenyag added this pull request to the merge queue May 25, 2026
Merged via the queue into GreptimeTeam:main with commit 8d3ebde May 25, 2026
47 checks passed

Labels

docs-not-required This change does not impact docs. size/M

3 participants